RNAnet provides a bridge between two widely used Human gene
databases. Ensembl describes DNA sequences and transcripts
but not experimental gene expression. GEO contains actual
expression levels from many thousands of Human samples but,
although samples in GEO can be queried by experiment,
comparison across the whole of GEO is not supported. RNAnet
provides immediate access to thousands of Affymetrix
HG-U133 2+ measurements provided by GEO, covering Human genes
in most medically interesting tissues. Data have been
quantile normalised and scanned for a variety of common
GeneChip errors. There are copious links back into GEO and
Ensembl.

Without RNAnet, comparison across experiments in GEO is very
labour intensive, requiring downloading and manual cleaning
of data files for each microarray in each experiment.
Previously, normalising more than a few dozen GeneChips was
tricky. Having downloaded tens of thousands of microarray
datasets, before RNAnet we could normalise all HG-U133 2+ in
ten hours. With RNAnet anyone can access cleaned, quantile
normalised data in seconds.
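
Quantile normalisation forces every array to share the same
distribution of values. The following is a minimal sketch in
Python with NumPy, not RNAnet's own code: the function name
quantile_normalise and the toy data are illustrative
assumptions, and ties are broken by sort order rather than
averaged.

```python
import numpy as np

def quantile_normalise(x):
    """Quantile normalise the columns of x (rows = probes, columns = arrays).

    Hypothetical helper for illustration; ties are broken by sort order.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")      # per-column sort order
    ranks = np.argsort(order, axis=0, kind="stable")  # rank of each value within its column
    reference = np.sort(x, axis=0).mean(axis=1)       # mean of the k-th smallest values across arrays
    return reference[ranks]                           # replace each value by the reference at its rank

# Toy data: 4 probes measured on 3 arrays (made-up numbers).
toy = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.5, 6.0],
                [4.0, 2.0, 8.0]])
normalised = quantile_normalise(toy)
```

After this step every column of normalised contains the same
set of values, only in a different order, so arrays from
different experiments become directly comparable.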

Further, since we have data from across many different
tissues and medical conditions, GEO data can be used to find
patterns of co-expression. RNAnet shows that the network of
strong correlations is huge but sparse. Thousands of genes
interact strongly with thousands of others. Conversely, tens
of thousands of genes interact strongly with fewer than 100
others. That is, RNAnet gives new views for RNA Systems
Biology. It builds on free but very valuable databases.
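
The "huge but sparse" description refers to how many strong
partners each gene has. A minimal sketch of that count is
given below, assuming Pearson correlation and an illustrative
threshold of 0.8; neither the function name strong_degree nor
the threshold comes from RNAnet.

```python
import numpy as np

def strong_degree(expr, threshold=0.8):
    """Count, for each gene, how many other genes it correlates strongly with.

    expr: rows = genes, columns = samples. The threshold is an illustrative
    assumption, not a value taken from RNAnet.
    """
    corr = np.corrcoef(expr)             # gene-by-gene Pearson correlations
    strong = np.abs(corr) >= threshold   # strong correlation or anti-correlation
    np.fill_diagonal(strong, False)      # a gene is not its own neighbour
    return strong.sum(axis=1)

# Synthetic data: three genes that move together (one inverted) and one unrelated gene.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
expr = np.vstack([base, 2 * base + 1, -base, rng.normal(size=50)])
degree = strong_degree(expr)
```

A histogram of degree over all genes would show the pattern
described above: a few thousand highly connected genes and a
long tail of genes with few strong partners.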

1. USING NORMALISED GEO DATA 

The URL
http://bioinformatics.essex.ac.uk/users/wlangdon/rnanet/probeset.php?
followed by an Affymetrix probeset identifier (e.g.
1556291_at) immediately gives a table of all the natural-log
(log_e) normalised values for the probeset. Exceptional or
suspect values are flagged with a "?". The table can be
loaded into a spreadsheet or other analysis tools. "Perfect
match" data (PM) have a clickable label pointing to their GEO
experiment. "Mis-match" data (MM) have similar hyperlinks
which take you directly to the GEO description of the
individual tissue sample.
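
As a sketch of how such a query could be scripted, the Python
below builds the URL for a probeset and separates a flagged
value from its "?" marker. The helper names probeset_url and
split_flag are hypothetical, and the assumption that the "?"
appears inside the same table cell as the number is ours; the
exact table layout is not specified here.

```python
from urllib.parse import quote

# Base URL as given in the text; the probeset id is appended after the "?".
BASE = "http://bioinformatics.essex.ac.uk/users/wlangdon/rnanet/probeset.php?"

def probeset_url(probeset_id):
    """Return the RNAnet query URL for one Affymetrix probeset id."""
    return BASE + quote(probeset_id, safe="")

def split_flag(cell):
    """Split one table cell into (value, suspect).

    Assumes (hypothetically) that a suspect value carries its "?" flag
    in the same cell as the number.
    """
    suspect = "?" in cell
    value = float(cell.replace("?", "").strip())
    return value, suspect

url = probeset_url("1556291_at")
```

Keeping the flag as a separate boolean lets suspect values be
excluded or highlighted later without losing the number.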

Users of Firefox can interactively plot data, either from
the same or different probesets, using
http://bioinformatics.essex.ac.uk/users/wlangdon/rnanet/scatter.html
The crosshairs provide access to individual values and
hyperlinks into the metadata held by GEO. Probes can be
specified by Affymetrix probeset id, by Ensembl exon id, or
by a mixture of the two. Colour allows multiple plots on the
same graph.


2. INTERACTIVE CORRELATION HEAT MAPS 

The web site contains many tens of thousands of
pre-calculated heat maps for Mouse, Arabidopsis, Rice and
Soybean, as well as Human. For Human genes, RNAnet also
supports flexible interactive construction of correlation
heat maps. Any set of probes can be correlated, whether the
probes map to the exon in sense, antisense or both
directions. The probes can be from the same or different
probesets and the heat maps can be of any size. Typically a
10 x 10 correlation heat map takes about a second to
calculate and display. Again, GeneChip data may be requested
either by Affymetrix probeset or Ensembl exon id. (Usually
several Affymetrix probes measure an exon. The one chosen as
being typical, i.e. most correlated, is indicated with an
asterisk *.) Correlations follow the same colour coding as
the fixed matrices. Hyperlinks on the matrix lead to the
underlying scatter plot. (Remember to press [plot].)
Additionally the text button displays the averages and
correlations in numeric form.
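
The numbers behind such a heat map are a square matrix of
pairwise correlations. A minimal sketch follows, assuming
Pearson correlation and assuming that "most correlated" means
the highest mean correlation with the other probes; both
function names are hypothetical and the data are synthetic.

```python
import numpy as np

def correlation_matrix(values):
    """Pearson correlations between probes (rows) across samples (columns)."""
    return np.corrcoef(np.asarray(values, dtype=float))

def most_typical(corr):
    """Index of the probe with the highest mean correlation to the others.

    This is an assumed reading of "most correlated", not RNAnet's stated rule.
    """
    n = corr.shape[0]
    mean_other = (corr.sum(axis=1) - 1.0) / (n - 1)  # drop the self-correlation of 1
    return int(np.argmax(mean_other))

# Synthetic example: three probes measuring one signal with increasing noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=200)
probes = np.vstack([signal + 0.1 * rng.normal(size=200),
                    signal + 0.5 * rng.normal(size=200),
                    signal + 2.0 * rng.normal(size=200)])
corr = correlation_matrix(probes)
typical = most_typical(corr)
```

Colouring each cell of corr by its value gives the heat map;
the row returned by most_typical would be the one marked with
an asterisk under the assumption above.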

3. RNA SYSTEMS BIOLOGY 

We calculated the correlations across all of GEO of 24 132
exons with each other. The main RNAnet graphical screen
allows Firefox users to query these 290 million correlation
coefficients by gene name or Ensembl exon id. Strong
correlations or anti-correlations can be plotted on a PCA of
the 290 million correlations. Additionally, up to ten exons
closest to the draggable crosshairs (cf. Section 1) can be
displayed. Once an exon is selected it should be locked into
the display (2nd box on right) to stop the next search
overwriting it. Again, heat maps are created and displayed as
needed. Due to non-unique mappings between Ensembl exons and
Affymetrix probesets, and concerns about sequence quality,
correlations for only 24 132 Ensembl exons are available.
They represent about half the Human genes.
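
The figure of 290 million is the number of unordered pairs
among 24 132 exons. The sketch below checks that arithmetic
and shows one common way to obtain two-dimensional PCA
coordinates from a correlation matrix via a singular value
decomposition. How RNAnet itself computes its PCA is not
described here, so pca_coordinates is an illustrative
assumption run on small synthetic data.

```python
import numpy as np

n_exons = 24132
n_pairs = n_exons * (n_exons - 1) // 2   # unordered exon pairs: 291 164 646

def pca_coordinates(corr, n_components=2):
    """Project each row of a correlation matrix onto its leading principal components.

    Illustrative only; not taken from RNAnet.
    """
    centred = corr - corr.mean(axis=0)                      # centre each column
    u, s, _ = np.linalg.svd(centred, full_matrices=False)   # principal axes via SVD
    return u[:, :n_components] * s[:n_components]           # scores on the leading components

# Small synthetic stand-in for the exon-by-exon correlation matrix.
rng = np.random.default_rng(2)
toy_expr = rng.normal(size=(6, 40))      # 6 "exons", 40 "samples"
toy_corr = np.corrcoef(toy_expr)
coords = pca_coordinates(toy_corr)       # one (x, y) point per exon
```

Each row of coords gives a plotting position for one exon, so
exons with similar correlation profiles land close together.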

To Dec 2009, 3 585 pages had been loaded by people
(excluding Essex and King's). In the last 15 months there
have been about 250 downloads per month. RNAnet was
publicised at UK Affy 2008 and a poster was presented at
IEMBL 2008. Cf. technical report [1]. RNAnet was used to
corroborate experimental results on Mycoplasma
contamination [2].